BACKGROUND ON THE EXAMPLE IFF SOURCE CODE
Jerry Morrison, 1/30/86

The example IFF code is written using a programming style and
techniques that may be unfamiliar to you. So here's a tutorial on
"call-back procedures", "enumerators", "interfaces", and "sub-classed
structures". I recommend these programming practices independently of
IFF software.


DEFINITIONS: "CLIENT" VS. "USER"

First, some definitions. The word "user" is reserved for a human user
of a software package. That's you and me. A "client" of a software
package, on the other hand, is a piece of software that uses that
software package. A program that calls operating system routines such
as "OpenFile" is a client of that operating system.


CALL-BACK PROCEDURES

Consider an operating system subroutine "ListDir" that lists the files
in a disk directory. It might allow you to list just the filenames
matching a pattern like "a*.text". Maybe you can ask it to list just
the files created since yesterday ... or those longer than 2000 bytes.

ListDir is a fancy, general-purpose directory subroutine that lets you
pass in a number of arguments to filter the listing. A C definition
might look like:

   void ListDir(directory, namePattern, minSize, maxSize, minDate ...)
      ...
      {
      for (each file in the directory)
         if ( PatternMatch(namePattern, filename) &&
              fileSize >= minSize &&
              fileSize <= maxSize &&
              fileDate >= minDate &&
              ... )
            printf("%s\n", filename);  /* probably fancier than this... */
      }

and your call to it:

   ListDir(myDir, "a*.text", 0, maxFileSize, date1_1_1900, ...);

When you think about it, these filtering arguments make up a
special-purpose "file filtering language". The person who designed
this subroutine "ListDir" might be pretty pleased with his
accomplishment. But in practice he can never put enough features into
this special-purpose language to satisfy everyone. (You say you need
to list just the files currently open?) And he may have provided a lot
of functionality that is rarely needed.
Is this filtering language what he should be spending his time
designing, writing, and debugging?

A much better technique is to use a "call-back procedure". The concept
is simple: instead of all those filter arguments to ListDir, you pass
it a pointer to a "filter procedure". ListDir simply calls your
procedure (via the pointer) to do the filtering, once per file. It
passes each filename to your "filter proc", which returns "TRUE" to
include that file in the listing or "FALSE" to skip it.

   typedef BOOL FilterProc();  /* FilterProc: a BOOL procedure */

   void ListDir(directory, filterProc)
      Directory directory;  FilterProc *filterProc;
      {
      for (each file in the directory)
         if ( (*filterProc)(filename) )
            printf("%s\n", filename);
      }

and your code:

   BOOL MyFilterProc(filename)  STRING filename;  {
      return(PatternMatch("a*.text", filename));
      }

   ...
   ListDir(myDir, MyFilterProc);

This technique has many advantages. It gives unlimited flexibility to
ListDir. It means you can use a general-purpose programming language
instead of learning a special-purpose filtering language. It's more
efficient to call a compiled subroutine than to "interpret" the
filtering parameters. And it means you can do anything you want in a
filter proc, from selecting files on the basis of numerology to
copying files to backup tape.

In practice, ListDir would have data about each file readily
available. So it should pass this data to the filter proc to save
time.

As Alan Kay once said, "Simple things should be simple and complex
things should be possible."


STANDARD CALL-BACK PROCEDURE

I could extend ListDir to accept a NULL FilterProc pointer to mean
"list all files". More likely, I'd supply a standard call-back
procedure "FilterTRUE" that always returns TRUE. Then
ListDir(directory, FilterTRUE) will list all files with no special
test for filterProc == NULL.

   BOOL FilterTRUE(filename)  STRING filename;  {
      return(TRUE);
      }


ENUMERATORS

Let's take our ListDir example one step further.
Rather than have ListDir print the selected filenames, have it JUST
call your custom proc for every file. Let your custom proc print the
filenames, maybe in your own personal format. Or maybe have it quietly
back up new files, or ask the user which ones to delete, or ...

   typedef void CallBackProc(/* filename */);

   void ListDir(directory, callBackProc)
      Directory directory;  CallBackProc *callBackProc;
      {
      for (each file in the directory)
         (*callBackProc)(filename);
      }

and your code:

   void MyProc(filename)  STRING filename;  {
      if ( PatternMatch("a*.text", filename) )
         printf("%s\n", filename);
      }

   ...
   ListDir(myDir, MyProc);

Now we're talking about a full-blown "enumerator". The procedure
"ListDir" is said to "enumerate" all the files in a directory. It
"applies" your call-back procedure to each file. The enumerator scans
the directory and your call-back procedure processes the files. It
deals with the internal directory details and you deal with the
printout. A nice separation of concerns.

ListDir should come with a standard call-back procedure
"PrintFilename" that lists the filename. By simply passing
PrintFilename to ListDir, you can print a directory. By writing a
call-back procedure that selectively calls PrintFilename, you can
filter the listing.

   void PrintFilename(filename)  STRING filename;  {
      printf("%s\n", filename);
      }


ENUMERATION CONTROL

A simple enhancement is to empower the call-back procedure to stop the
enumeration early. That's easy. Have it return "TRUE" to stop. This is
very handy, for example, to quit when you find what you're looking
for.

Let's expand this boolean "continue/stop" result into an integer error
code.

   #define OKAY  0
   #define DONE -1
   typedef int CallBackProc(/* filename */);

   int ListDir(directory, callBackProc)
      Directory directory;  CallBackProc *callBackProc;
      {
      int result = OKAY;
      for (each file in the directory, while result == OKAY)
         result = (*callBackProc)(filename);
      return(result);
      }


IFF FILE ENUMERATOR

Now we'll relate these techniques to the example IFF code. I'm
assuming that you've read "EA IFF 85" Standard for Interchange Format
Files. That memo is available from Commodore as part of their Amiga
documentation. Also ask Commodore for "ILBM" IFF Interleaved Bitmap
and the example IFF source code.

Two things make IFF files very flexible for lots of interchange
between programs. First, file formats are independent of RAM formats.
That means you have to do some conversion when you read and write IFF
files. Second, the contents are stored in chunks according to global
rules. That means you have to parse the file, i.e. scan it and react
to what's actually there.

In the example IFF files IFF.H and IFFR.C, the routines ReadIFF,
ReadIList, & ReadICat are enumeration procedures. ReadIFF scans an IFF
file, enumerating all the "FORM", "LIST", "PROP", and "CAT" chunks
encountered. ReadIList & ReadICat enumerate all the chunks in a LIST
and CAT, respectively.

A ClientFrame record is a bundle of pointers to 4 "call-back
procedures" getList, getProp, getForm, and getCat. These 4 procedures
are called by ReadIFF, ReadIList, and ReadICat when the 4 kinds of IFF
"groups" are encountered: "LIST", "PROP", "FORM", or "CAT".

These 3 enumerator procedures and 4 client procedures together make up
a reader for IFF files--a very simple recursive descent parser. If you
want to learn more about parsing, a really good place to look is the
new edition of the "dragon book" by Aho, Sethi, and Ullman.

The procedure "SkipGroup" is just a default call-back procedure.

The "IFFP" values IFF_OKAY through BAD_IFF are the error codes used by
the IFF enumerators.
We use the type "IFFP" to declare variables (and procedure results)
that hold such values. The code "IFF_OKAY" means "AOK; keep
enumerating". The other values mean "stop" for one reason or another.
"IFF_DONE" means "we're all done", while "END_MARK" means "we hit the
end at this nesting level".


CALL-BACK PROCEDURE STATE

ListDir is an enumerator with some internal state--it internally
remembers its place in the directory. It loops over the directory,
calling the client proc once per file. That's fine for some cases and
less convenient for others. Consider this example that just lists the
first 10 files:

   int count;

   int PrintFirst10(filename)  STRING filename;  {
      if (++count > 10)  return(DONE);
      printf("%s\n", filename);
      return(OKAY);
      }

   void DoIt()  {
      ...
      count = 0;
      ListDir(myDir, PrintFirst10);
      ...
      }

Inherently, the client's code has to be split into code that calls the
enumerator and a call-back procedure. Thus any communication between
the two must be via global variables. In this trivial example, the
global "count" saves state data between calls to PrintFirst10. Often,
it's much more complex. But globals won't work if you need reentrant
or recursive code. We really want "count" to be a local variable of
DoIt.

Fixing this in Pascal is easy: Define PrintFirst10 as a nested
procedure within DoIt so it can access DoIt's local variables. The
manual analog in C is to redefine the enumerator to pass a raw "client
data pointer" straight through to the call-back procedure. The two
client procedures then communicate through the "client data pointer".
DoIt would call ListDir(myDir, PrintFirst10, &count) which calls
PrintFirst10(filename, &count).
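
From the client's side, the client-data technique can be sketched as follows. This is only an illustration in present-day ISO C rather than the older dialect of the listings in this tutorial. The directory is faked as a NULL-terminated array of names, and the Directory type, the PrintFirstN and PrintFirst procedures, and the exact signatures are assumptions made for the sketch, not an actual operating system interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define OKAY  0
#define DONE -1

typedef int CallBackProc(const char *filename, void *clientData);

/* Hypothetical stand-in for a directory: a NULL-terminated list of names. */
typedef const char *const *Directory;

/* Enumerator: hands clientData straight through to the call-back. */
int ListDir(Directory directory, CallBackProc *callBackProc, void *clientData)
{
    int result = OKAY;
    assert(directory != NULL && callBackProc != NULL);
    for (size_t i = 0; directory[i] != NULL && result == OKAY; i++)
        result = callBackProc(directory[i], clientData);
    return result;
}

/* Call-back: prints names until the caller's counter runs out. */
static int PrintFirstN(const char *filename, void *clientData)
{
    int *remaining = clientData;
    if (*remaining <= 0)
        return DONE;
    (*remaining)--;
    printf("%s\n", filename);
    return OKAY;
}

/* Client: the counter is a local variable, so no globals are needed.
   Returns the number of names actually printed. */
int PrintFirst(Directory directory, int limit)
{
    int remaining = limit;
    ListDir(directory, PrintFirstN, &remaining);
    return limit - remaining;
}
```

Because the counter lives on PrintFirst's stack, two enumerations can be in progress at once without disturbing each other. In the pseudo-C used throughout this tutorial, the enumerator itself becomes: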

   #define OKAY  0
   #define DONE -1
   typedef int CallBackProc(/* filename, clientData */);

   int ListDir(directory, callBackProc, clientData)
      Directory directory;  CallBackProc *callBackProc;
      BYTE *clientData;
      {
      int result = OKAY;
      for (each file in the directory, while result == OKAY)
         result = (*callBackProc)(filename, clientData);
      return(result);
      }

In general, an enumerator is sometimes inconvenient because it takes
over control. Think about this: How could you enumerate two
directories in parallel and copy the newer files from one directory to
the other?


STATELESS ENUMERATOR

An alternate form without this disadvantage is the "stateless
enumerator". In a stateless enumerator, it's up to the client to keep
its place in the enumeration. Call a procedure like GetNextFilename
each time around the loop.

   STRING curFilename = NULL;
   int count = 0;
   do {
      if (++count > 10)  break;   /* stop after 10 files */
      curFilename = GetNextFilename(directory, curFilename);
      if (curFilename == NULL)  break;   /* stop at end of directory */
      printf("%s\n", curFilename);
      } while (TRUE);

The stateless enumerator is sometimes better because it puts the
client in control. The above example shows how easy it is to keep
state information between iterations and to stop the enumeration
early. It's also easy to do things like list two directories in
parallel.


IFF CHUNK ENUMERATOR

The following IFFR.C routines make up a stateless IFF chunk
enumerator: OpenRIFF, OpenRGroup, GetChunkHdr and CloseRGroup.
Together with IFFReadBytes, we have the basis for a flexible IFF
reader. It handles whatever it finds, unlike inflexible file readers
that demand conformance to a rigid file format. [Note: This code
doesn't check for errors or end-of-context.]

   OpenRGroup(..., context);      /* initialize */
   do {
      id = GetChunkHdr(context);  /* get the next chunk's ID */
      switch (id) {
         case AAAA: {read in an AAAA chunk; break};
         case BBBB: {read in a BBBB chunk; break};
         ...
         default: {};             /* just ignore unrecognized chunks */
         }
      }
   CloseRGroup(context);          /* cleanup */

GetChunkHdr reads the next chunk header and returns its chunk ID. You
then dispatch on the chunk ID, that is, switch to a different piece of
code for each type of chunk. If you don't recognize the chunk ID, just
keep looping.

In each "case:" statement, call IFFReadBytes one or more times to read
the chunk's contents. The reading work you do here depends on the
chunk type and what you need in RAM. Since GetChunkHdr automatically
skips to the start of the next chunk, it doesn't matter if you don't
read all the data bytes.

GetChunkHdr does some other things for you automatically. When it
reads a "group" chunk header (a chunk of type "FORM", "LIST", "CAT ",
or "PROP") it automatically reads the subtype ID. That makes it very
convenient to just open the contents of the group chunk as a group
context and read the nested chunks. See the example source program
ShowILBM for more about the relationship between a "GroupContext" and
a "ClientFrame".

Like all the example IFF code, GetChunkHdr checks for errors. To
handle GetChunkHdr errors, we just add cases to the switch statement.
To stop at end-of-context or an error in a switch case, we add a
"while" clause at the end of the "do" statement.


CLIENTS, INTERFACES, AND IMPLEMENTORS

In the ListDir example, you can see that a lot of flexibility comes
from decoupling the task of tracing through the directory's data
structures from the task of filtering files and printing filenames.
This is called modularity, or simply, dividing a program into parts.

Choosing good module boundaries is an art. It has a big impact on a
programmer's ability to cope with large programs. Good modularity
makes programs much easier to understand and modify. But this topic
would be another whole tutorial in itself.

Just be aware that the example IFF program is divided into various
"modules", each of which implements a different part of the bigger
picture.
One such module is the low level IFF reader/writer. It's split into
two files IFFR.C and IFFW.C. Other such modules are the run
encoder/decoder Packer.C and UnPacker.C, and ILBM read/write
subroutines ILBMR.C and ILBMW.C.

You'll notice that all three of these "modules" are split into a pair
of files. That's because most linkers aren't fancy enough to
automatically eliminate unused subroutines, e.g. for a program like
ShowILBM that reads but doesn't need the writer code. Also, a program
like DeluxePaint wants read and write code in separate overlays. So
think of each pair as a single module.

What I want to point out is the basic structure. Each "module" has an
"interface" file (a .H file) that separates the "implementor" .C
file(s) from the "client" programs. This interface is very important,
in fact, more important than the code details inside the .C files. The
interfaces for the above-mentioned modules are called IFF.H, Packer.H,
and ILBM.H.

Everything about a layer of software that the clients need to know
belongs in its interface: constant and type definitions, extern
declarations for the procedures, and comments. The comments detail the
purpose of the module and each procedure, the procedure arguments,
side effects, results, and error codes, etc. Nothing the clients don't
need to know belongs in its interface: internal implementation details
that might change.

Thus, the modularization and other important design information is
collected and documented in these interface files. So if you want to
understand what a module does and how to use it, READ ITS INTERFACE.
Don't dive headfirst into the implementation.

Two of the original articles on modular programming are:

   D.L. Parnas, "On the Criteria To Be Used in Decomposing Systems
   into Modules". Communications of the ACM 15, 12 (Dec. '72),
   pp 1053-1058.

   B. Liskov and S. Zilles, "Programming with Abstract Data Types".
   Proceedings ACM SIGPLAN Conference on Very High-Level Languages.
   SIGPLAN Notices 9, 4 (April '74), pp 50-59.


SUBCLASSED STRUCTURES

One more technique. In programming, a general-purpose module may
define a structure like ClientFrame. Along comes a more
special-purpose program that needs a structure like it but with
specialized fields added on. The answer is to build a larger structure
whose first field is the earlier structure. This is called
"subclassing" a structure, a term that comes from subclassing in
Smalltalk.

In the Macintosh(tm) toolbox, the record GrafPort is subclassed to
produce the record WindowRecord, which is subclassed again to produce
a DialogRecord.

Similarly in the example IFF program ShowILBM, the structure
ClientFrame is subclassed to produce the more specialized structure
ILBMFrame.

   typedef struct {
      ClientFrame clientFrame;
      UBYTE foundBMHD;
      ...
      } ILBMFrame;

Since the first field of an ILBMFrame is a ClientFrame, the ShowILBM
procedure ReadPicture can coerce an *ILBMFrame pointer to a
*ClientFrame pointer to pass it to ReadIFF (which knows nothing about
ILBMFrame). When ReadIFF calls back ShowILBM's getForm procedure, we
can coerce it back to an *ILBMFrame pointer. Take a look at ShowILBM
to see how this works.
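
The two coercions can be sketched in a few lines. This is a minimal illustration in present-day ISO C, not the code in ShowILBM: Frame, PictureFrame, Enumerate, CountForm, and CountForms are hypothetical stand-ins for ClientFrame, ILBMFrame, ReadIFF, a getForm procedure, and ReadPicture, and their signatures are assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical general-purpose structure, standing in for ClientFrame. */
typedef struct Frame Frame;
typedef int FormProc(Frame *frame, int formNumber);

struct Frame {
    FormProc *getForm;      /* call-back supplied by the client */
};

/* Hypothetical "subclass": its FIRST field is the general structure. */
typedef struct {
    Frame frame;            /* must come first for the coercions to work */
    int   formsSeen;        /* specialized field added by the client */
} PictureFrame;

/* Stand-in for an enumerator such as ReadIFF. It knows only about Frame. */
static int Enumerate(Frame *frame, int formCount)
{
    int result = 0;
    assert(frame != NULL && frame->getForm != NULL);
    for (int i = 0; i < formCount && result == 0; i++)
        result = frame->getForm(frame, i);
    return result;
}

/* Client call-back: coerce the Frame pointer back to the subclass. */
static int CountForm(Frame *frame, int formNumber)
{
    PictureFrame *pf = (PictureFrame *)frame;
    (void)formNumber;       /* unused in this sketch */
    pf->formsSeen++;
    return 0;               /* 0 means "keep enumerating" here */
}

/* Client entry point: returns how many forms the call-back saw. */
int CountForms(int formCount)
{
    PictureFrame pf;
    pf.frame.getForm = CountForm;
    pf.formsSeen = 0;
    Enumerate((Frame *)&pf, formCount);   /* coerce subclass to base */
    return pf.formsSeen;
}
```

The casts are safe because C guarantees that a pointer to a structure, suitably converted, points to its first member and vice versa. If the embedded structure were moved out of first position, both coercions would silently break, so the "first field" rule deserves a comment in the declaration.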